    SQ Lower Bounds for Learning Bounded Covariance GMMs

    We study the complexity of learning mixtures of separated Gaussians with a common unknown bounded covariance matrix. Specifically, we focus on learning Gaussian mixture models (GMMs) on $\mathbb{R}^d$ of the form $P = \sum_{i=1}^k w_i \mathcal{N}(\boldsymbol{\mu}_i, \mathbf{\Sigma}_i)$, where $\mathbf{\Sigma}_i = \mathbf{\Sigma} \preceq \mathbf{I}$ and $\min_{i \neq j} \|\boldsymbol{\mu}_i - \boldsymbol{\mu}_j\|_2 \geq k^\epsilon$ for some $\epsilon > 0$. Known learning algorithms for this family of GMMs have complexity $(dk)^{O(1/\epsilon)}$. In this work, we prove that any Statistical Query (SQ) algorithm for this problem requires complexity at least $d^{\Omega(1/\epsilon)}$. In the special case where the separation is on the order of $k^{1/2}$, we additionally obtain fine-grained SQ lower bounds with the correct exponent. Our SQ lower bounds imply similar lower bounds for low-degree polynomial tests. Conceptually, our results provide evidence that known algorithms for this problem are nearly best possible.
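
    As a minimal illustration of this data model (not taken from the paper), the sketch below samples from a mixture $P = \sum_i w_i \mathcal{N}(\boldsymbol{\mu}_i, \mathbf{\Sigma})$ with a shared covariance $\mathbf{\Sigma} \preceq \mathbf{I}$ and pairwise mean separation at least $k^\epsilon$; the concrete construction (means on a coordinate axis, uniform weights, diagonal covariance) is a hypothetical choice.

        # Minimal sketch of the GMM family studied above; all concrete
        # choices (mean placement, weights, covariance) are assumptions.
        import numpy as np

        def sample_separated_gmm(n, d, k, eps, seed=0):
            rng = np.random.default_rng(seed)
            sep = k ** eps
            mus = np.zeros((k, d))
            mus[:, 0] = sep * np.arange(k)   # ||mu_i - mu_j||_2 = sep*|i-j| >= k^eps
            evals = rng.uniform(0.1, 1.0, size=d)
            sigma = np.diag(evals)           # common covariance with Sigma <= I
            comps = rng.choice(k, size=n)    # uniform mixing weights w_i = 1/k
            x = rng.multivariate_normal(np.zeros(d), sigma, size=n) + mus[comps]
            return x, comps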

    Information-Computation Tradeoffs for Learning Margin Halfspaces with Random Classification Noise

    We study the problem of PAC learning $\gamma$-margin halfspaces with Random Classification Noise. We establish an information-computation tradeoff suggesting an inherent gap between the sample complexity of the problem and the sample complexity of computationally efficient algorithms. Concretely, the sample complexity of the problem is $\widetilde{\Theta}(1/(\gamma^2 \epsilon))$. We start by giving a simple efficient algorithm with sample complexity $\widetilde{O}(1/(\gamma^2 \epsilon^2))$. Our main result is a lower bound for Statistical Query (SQ) algorithms and low-degree polynomial tests suggesting that the quadratic dependence on $1/\epsilon$ in the sample complexity is inherent for computationally efficient algorithms. Specifically, our results imply a lower bound of $\widetilde{\Omega}(1/(\gamma^{1/2} \epsilon^2))$ on the sample complexity of any efficient SQ learner or low-degree test.
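
    For concreteness, here is a hedged sketch (not the paper's algorithm) of the underlying data model: points with margin at least $\gamma$ around a unit-norm target $w$, with each label then flipped independently with probability $\eta < 1/2$. The rejection-sampling construction and the choice of target are assumptions made for illustration.

        # Minimal sketch of gamma-margin halfspace data under Random
        # Classification Noise; target direction and sampling are assumptions.
        import numpy as np

        def sample_margin_rcn(n, d, gamma, eta, seed=0):
            rng = np.random.default_rng(seed)
            w = np.zeros(d); w[0] = 1.0                    # unit-norm target
            x = rng.standard_normal((n, d))
            x /= np.linalg.norm(x, axis=1, keepdims=True)  # points on the unit sphere
            keep = np.abs(x @ w) >= gamma                  # enforce the gamma-margin
            x = x[keep]
            y = np.sign(x @ w)
            flips = rng.random(len(y)) < eta               # RCN: flip w.p. eta
            y[flips] *= -1
            return x, y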

    Learning general halfspaces with general Massart noise under the Gaussian distribution

    We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under the Gaussian distribution. In the Massart model, an adversary is allowed to flip the label of each point $\mathbf{x}$ with unknown probability $\eta(\mathbf{x}) \leq \eta$, for some parameter $\eta \in [0, 1/2]$. The goal is to find a hypothesis with misclassification error of $\mathrm{OPT} + \epsilon$, where $\mathrm{OPT}$ is the error of the target halfspace. This problem had been previously studied under two assumptions: (i) the target halfspace is homogeneous (i.e., the separating hyperplane goes through the origin), and (ii) the parameter $\eta$ is strictly smaller than $1/2$. Prior to this work, no nontrivial bounds were known when either of these assumptions is removed. We study the general problem and establish the following. For $\eta < 1/2$, we give a learning algorithm for general halfspaces with sample and computational complexity $d^{O_\eta(\log(1/\gamma))} \, \mathrm{poly}(1/\epsilon)$, where $\gamma = \max\{\epsilon, \min\{\mathbf{Pr}[f(\mathbf{x}) = 1], \mathbf{Pr}[f(\mathbf{x}) = -1]\}\}$ is the bias of the target halfspace $f$. Prior efficient algorithms could only handle the special case of $\gamma = 1/2$. Interestingly, we establish a qualitatively matching lower bound of $d^{\Omega(\log(1/\gamma))}$ on the complexity of any Statistical Query (SQ) algorithm. For $\eta = 1/2$, we give a learning algorithm for general halfspaces with sample and computational complexity $O_\epsilon(1) \, d^{O(\log(1/\epsilon))}$. This result is new even for the subclass of homogeneous halfspaces; prior algorithms for homogeneous Massart halfspaces provide vacuous guarantees for $\eta = 1/2$. We complement our upper bound with a nearly-matching SQ lower bound of $d^{\Omega(\log(1/\epsilon))}$, which holds even for the special case of homogeneous halfspaces.
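
    The Massart model above differs from Random Classification Noise in that the flip probability may depend on the point. Below is a hedged sketch of one such data source for a general (non-homogeneous) halfspace $f(\mathbf{x}) = \mathrm{sign}(\langle w, \mathbf{x} \rangle - t)$ under $\mathcal{N}(0, \mathbf{I})$; the particular $\eta(\mathbf{x})$, which flips more often near the decision boundary, is a hypothetical adversary, not one from the paper.

        # Minimal sketch of Massart-noise labels for a biased halfspace;
        # the concrete flip function eta_x is an assumed example.
        import numpy as np

        def sample_massart_halfspace(n, d, t, eta, seed=0):
            rng = np.random.default_rng(seed)
            w = np.zeros(d); w[0] = 1.0                   # unit-norm target direction
            x = rng.standard_normal((n, d))               # x ~ N(0, I_d)
            clean = np.where(x @ w - t >= 0, 1.0, -1.0)   # f(x); threshold t sets the bias
            eta_x = eta * np.exp(-np.abs(x @ w - t))      # eta(x) <= eta, largest at boundary
            y = np.where(rng.random(n) < eta_x, -clean, clean)
            return x, y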